Principled Induction of Phrasal Bilexica
Authors
Abstract
We aim to replace the long and complicated pipeline employed to produce probabilistic phrasal bilexica with a theoretically principled, grammar-based approach. To this end, we introduce a learning regime for inducing a phrasal grammar equivalent to linear transduction grammars. The stochastic version of this new grammar type also has the property that the set of biterminals constitutes a natural probability distribution, making it similar to a probabilistic translation lexicon. Since we learn a phrasal grammar, we are, in effect, learning a probabilistic phrasal bilexicon. As a proof of concept, we show that phrasal bilexica induced in this manner can be used to improve the performance of a traditional phrase-based SMT system.
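The property that biterminals form a probability distribution can be illustrated with a minimal sketch. It is not the paper's induction procedure: the function names `normalize_biterminals` and `conditional_lexicon`, the toy phrase pairs, and their weights are all hypothetical, chosen only to show how a weighted set of phrase pairs can be read as a joint distribution and turned into a conditional translation lexicon.

```python
from collections import defaultdict


def normalize_biterminals(weights):
    """Scale non-negative biterminal weights into a joint distribution p(e, f)."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("total weight must be positive")
    return {pair: w / total for pair, w in weights.items()}


def conditional_lexicon(joint):
    """Derive p(f | e) from the joint p(e, f) by dividing by the marginal p(e)."""
    marginal = defaultdict(float)
    for (e, _f), p in joint.items():
        marginal[e] += p
    return {(e, f): p / marginal[e] for (e, f), p in joint.items()}


# Hypothetical toy weights for three phrase pairs (biterminals).
toy_weights = {
    ("the house", "das Haus"): 3.0,
    ("the house", "das Gebäude"): 1.0,
    ("house", "Haus"): 4.0,
}

joint = normalize_biterminals(toy_weights)
cond = conditional_lexicon(joint)
```

Here `joint` sums to one over all phrase pairs, which is the sense in which the biterminal set resembles a probabilistic lexicon, while `cond` gives the per-source-phrase translation probabilities that a phrase-based system would typically consume.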
Related Resources
Speech Translation with Grammar Driven Probabilistic Phrasal Bilexica Extraction
We introduce a new type of transduction grammar that allows for learning of probabilistic phrasal bilexica, leading to a significant improvement in spoken language translation accuracy. The current state-of-the-art in statistical machine translation relies on a complicated and crude pipeline to learn probabilistic phrasal bilexica—the very core of any speech translation system. In this paper, w...
The Effect of Conceptual Metaphor Awareness on Learning Phrasal Verbs by Iranian Intermediate EFL Learners
The ability to comprehend and produce phrasal verbs, as lexical chunks or groups of words which are commonly found together, is an important part of language learning. This study investigates the effect of ‘conceptual metaphor awareness’, as a newly developed technique in Cognitive Linguistics, on learning phrasal verbs by Iranian intermediate EFL learners. To meet this objective, two intact ho...
From Finite-State to Inversion Transductions: Toward Unsupervised Bilingual Grammar Induction
We report a wide range of comparative experiments establishing for the first time contrastive foundations for a completely unsupervised approach to bilingual grammar induction that is cognitively oriented toward early category formation and phrasal chunking in the bootstrapping process up the expressiveness hierarchy from finite-state to linear to inversion transduction grammars. We show a cons...
A principled Cognitive Linguistics account of English phrasal verbs with up and out
Many attempts have been made to discover some systematicity in the semantics of phrasal verbs. However, most research has investigated the semantics of particles exclusively; no study has examined how the multiple meanings of the verb also contribute to the meanings of phrasal verbs. The current corpus-based (COCA) study advances the research on phrasal verbs by examining the interaction of the...
Approach to Automatic Translation Template Acquisition Based on Unannotated Bilingual Grammar Induction
In this paper, we propose a new approach that automatically acquires translation templates from unannotated bilingual spoken-language corpora in the travel information access domain. The approach adopts two basic algorithms: a grammar induction algorithm and a dynamic programming algorithm. Our approach is an unsupervised, statistical, data-driven method which avoids the...